In this paper, we consider the problem of multi-view embedding from different visual cues and modalities. We propose a unified solution for subspace learning methods using the Rayleigh quotient, which is extensible to multiple views, supervised learning, and non-linear embeddings. Numerous methods, including Canonical Correlation Analysis, Partial Least Squares regression, and Linear Discriminant Analysis, are studied using specific intrinsic and penalty graphs within the same framework. Non-linear extensions based on kernels and (deep) neural networks are derived, achieving better performance than the linear ones. Moreover, a novel Multi-view Modular Discriminant Analysis (MvMDA) is proposed by taking the view difference into consideration. We demonstrate the effectiveness of the proposed multi-view embedding methods on visual object recognition and cross-modal image retrieval, and obtain superior results in both applications compared to related methods.
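
For readers unfamiliar with the Rayleigh quotient, the following sketch shows the textbook form of the objective and how two of the methods named above fit it. The notation is an assumption made here for illustration and is not taken from the paper: $A$ and $B$ are symmetric matrices with $B$ positive definite, $S_b$ and $S_w$ are the between-class and within-class scatter matrices, and $C_{xx}$, $C_{yy}$, $C_{xy}$ are the covariance and cross-covariance matrices of two views. The paper's own formulation in terms of intrinsic and penalty graphs may differ in detail.

```latex
% Generalized Rayleigh quotient (assumed notation: A, B symmetric, B positive definite).
% Its stationary points satisfy a generalized eigenvalue problem, and the
% maximizer is the eigenvector with the largest eigenvalue.
\[
  w^{\star} = \arg\max_{w \neq 0} \frac{w^{\top} A\, w}{w^{\top} B\, w}
  \quad\Longrightarrow\quad
  A\, w^{\star} = \lambda_{\max}\, B\, w^{\star}
\]

% Linear Discriminant Analysis as an instance:
% between-class scatter in the numerator, within-class scatter in the denominator.
\[
  A = S_b, \qquad B = S_w
\]

% Two-view Canonical Correlation Analysis as an instance, with w = (w_x, w_y)
% stacking the projection vectors of both views.
\[
  A = \begin{pmatrix} 0 & C_{xy} \\ C_{yx} & 0 \end{pmatrix},
  \qquad
  B = \begin{pmatrix} C_{xx} & 0 \\ 0 & C_{yy} \end{pmatrix}
\]
```

In both instances only the pair $(A, B)$ changes while the solver stays the same, which is the sense in which a single Rayleigh-quotient formulation can cover several subspace learning methods.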